Estimation Methods for the Size of Deep Web Textural Data Source: A Survey
Abstract
The estimation of the size of deep web data sources has been an open problem since 1998. This survey reviews all the papers and other resources available online on estimating the size of data sources that appeared between 1998 and 2008. We first clarify several basic terms that are used in the survey but whose meanings vary in the literature, and we discuss the basic models used in the estimation literature. The survey then introduces query-based sampling approaches and reviews methods for estimating the relative size and the actual size of data sources. Query-based sampling is biased, and the survey also reviews research on overcoming the biases introduced by various estimation methods. Finally, future directions for size estimation are discussed.
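To make the idea of estimating actual size from samples concrete, here is a minimal sketch of one classical estimator, the Lincoln-Petersen capture-recapture formula, applied to two samples of document IDs. This is an illustrative assumption: the abstract does not state which estimators the survey covers, and the function name `lincoln_petersen` and the document IDs are hypothetical.

```python
def lincoln_petersen(sample_a, sample_b):
    """Estimate collection size from two samples of document IDs.

    Illustrative only: assumes both samples are uniform and independent,
    which query-based sampling does not guarantee.
    """
    a, b = set(sample_a), set(sample_b)
    overlap = len(a & b)  # documents captured in both samples
    if overlap == 0:
        raise ValueError("samples do not overlap; estimate is undefined")
    return len(a) * len(b) / overlap


# Hypothetical document IDs returned by two independent sets of queries.
first_sample = range(0, 200)     # 200 documents
second_sample = range(150, 400)  # 250 documents, 50 shared with the first
estimate = lincoln_petersen(first_sample, second_sample)
print(estimate)  # 1000.0
```

The estimator is only as good as its sampling assumptions. When documents are not equally likely to be retrieved, as with query-based sampling, the overlap is distorted and the estimate is biased.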
Similar resources
DASTWAR: a tool for completeness estimation in magnitude-size plane
Today, great observatories around the world devote a substantial amount of observing time to sky surveys. The resulting images are the inputs to source-finder modules. These modules search for the target objects and provide us with source catalogues. We sought to quantify the ability of detection tools to recover faint galaxies regularly encountered in deep surveys. Our approach was based on com...
Improving the estimation of the soil-water characteristic curve using the particle size distribution curve and soil bulk density
Soil particle size distribution and bulk density are used for estimating the soil-moisture characteristic curve. In this model, the soil particle size distribution curve is divided into a number of segments, each with a specific particle radius and the cumulative percentage of particles greater than that radius. Using these data, the soil-moisture characteristic curve is estimated. In the model a scale f...
SPOT-5 Spectral and Textural Data Fusion for Forest Mean Age and Height Estimation
Precise estimation of forest structural parameters supports decision makers in the sustainable management of forests. Moreover, timber volume estimation and consequently the economic value of a forest can be derived based on the structural parameter quantization. The mean age and height of the trees are two important parameters for estimating the productivity of the plantations. This research ...
Ranking bias in deep web size estimation using capture recapture method
Many deep web data sources are ranked data sources, i.e., they rank the matched documents and return at most the top k results even when more than k documents match the query. When estimating the size of such a ranked deep web data source, it is well known that there is a ranking bias: the traditional methods tend to underestimate the size when queries overflow (match m...